Liar’s Poker: a tricky problem
Liar’s Poker is a card game we sometimes play around the office. It’s a pretty simple game to learn if you’re already familiar with poker hands: everyone is dealt 3 cards, and we go around in a circle naming higher and higher poker hands (“ace high,” “pair of 2’s,” “pair of 5’s with a queen kicker,” &c). When it’s your turn, you can either name a higher poker hand than the previous player or challenge their bid. When someone challenges, we throw all our cards into the middle and try to make exactly the 5-card poker hand that was named. For instance, if “ace high with a queen kicker” were challenged and the pooled cards were two 2’s, a 4, a 10, a queen, and an ace, the hand could be made by selecting every card except one of the 2’s. Without the 4, however, the bid cannot be made: although there is an ace and a queen, any 5-card selection must use both 2’s, so the only hand that can be made is a pair of 2’s. If the hand can be made, the challenger loses a card for all subsequent rounds; if it cannot, the bidder loses a card. When you’re out of cards, you’re out of the game. The winner is the last person with cards left.
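To make the challenge step concrete, here’s a small Python sketch (my own encoding, not part of the game’s canonical rules) for bids of the form “X high with a Y kicker”: such a bid can be made exactly when some 5-card subset of the pooled cards has five distinct ranks whose two highest are the named cards.

```python
from itertools import combinations

def can_make_high_card(cards, high, kicker):
    """Check whether some 5-card subset of `cards` (rank values) forms
    exactly the hand "high card `high` with `kicker` kicker": five
    distinct ranks whose two largest are `high` and `kicker`."""
    for combo in combinations(cards, 5):
        ranks = sorted(combo, reverse=True)
        if len(set(ranks)) == 5 and ranks[0] == high and ranks[1] == kicker:
            return True
    return False

# The example from the text: two 2's, a 4, a 10, a queen (12), an ace (14).
print(can_make_high_card([2, 2, 4, 10, 12, 14], 14, 12))  # True
# Without the 4, every 5-card selection must use both 2's, so it fails.
print(can_make_high_card([2, 2, 10, 12, 14], 14, 12))     # False
```

The brute-force subset scan is fine here, since a challenge only ever pools a handful of cards.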
The game is simple enough that we can all play it pretty easily, but hard enough that it’s nontrivial to write a good AI to play it for you. We’re hoping to have a Liar’s Poker AI tournament pretty soon (we’re about a quarter of the way done writing the infrastructure to get the AIs to interact, though no one has written an AI so far). It’s pretty interesting.
However, a question that has dogged me since I learned the game a year ago is what the optimal strategy for the final round is. That is to say, I and my opponent each have 1 card. What is my best strategy? Assuming my opponent knows my strategy and can act accordingly, what are the chances that I win?
- You (as player 1) probably don’t want a deterministic strategy: if you’re dealt card c, you want to bid hand h with probability a_{ch}, and we’re interested in finding all the a_{ch}’s in order to maximize our chances of winning.
- Player 2 has a very simple strategy: given player 1’s bid, determine the most likely card that player 1 has, and act accordingly:
- If his bid is a pair, challenge it. If it’s not there, you win. If it is there, anything you bid will be challenged and lose, so challenging is never worse than bidding.
- If player 1 bid a high card and your card is higher than it, call BS.
- If player 1 bid a high card and your card is lower than it, either call BS if you think he’s lying or bid his card with yours as a kicker if you think he really has the named card.
- If player 1 bids your card as a high card, bid or challenge depending on what you think player 1 has (assume he has the card he is most likely to have).
- If there are two cards that are equally likely, just pick one deterministically and act as though player 1 has that card. This makes the math simpler, and player 1 cannot profit by this determinism: if he makes the card you choose less likely, it would no longer be tied for the most likely, and your strategy will change.
- Note that if player 1 is interested in bidding a pair, it needs to be a pair of the card he was dealt: player 2 is certainly going to call BS, so you should at least give yourself a fighting chance by making the bid plausible from your perspective.
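The inference step at the heart of player 2’s strategy above can be sketched in Python. This is a toy encoding of my own, assuming player 2 knows player 1’s bidding probabilities a[c][h]: each candidate card c is weighted by how many copies of it remain from player 2’s point of view (3 if it matches his own card, 4 otherwise) times a_{ch}, and ties are broken deterministically, per the tie-breaking rule above.

```python
def most_likely_card(bid, my_card, a, cards):
    """Player 2's inference step (a sketch): given player 1's bid,
    weight each candidate card c by
        P(player 1 holds c and bids `bid`)  proportional to  copies(c) * a[c][bid],
    where copies(c) is 3 if player 2 holds a card of that rank and 4
    otherwise.  max() breaks ties deterministically by position in
    `cards`, matching the post's tie-breaking rule."""
    def weight(c):
        copies = 3 if c == my_card else 4
        return copies * a[c].get(bid, 0.0)
    return max(cards, key=weight)

# Hypothetical 8-card deck (aces and kings): if player 1 always bids
# "ace high" regardless of his card, player 2 holding a king finds an
# ace more likely (weight 4 * 1.0) than a king (weight 3 * 1.0).
a = {"A": {"ace high": 1.0}, "K": {"ace high": 1.0}}
print(most_likely_card("ace high", "K", a, ["A", "K"]))  # A
```

Note that the 3-vs-4 weighting is exactly where the `(3*a_{ch} > 4*a_{eh})` comparisons in the formulation below come from.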
I talked with Reid about the problem a bit, and our discussion eventually led to the following formulation:
- Let’s start with some terminology:
- Let there be n cards in the deck, so that there are n/4 different ranks (each having 4 cards, 1 of each suit).
- Let C be the set of cards (really ranks, since suit won’t be important) you could be dealt, and let H be the set of hands you can bid.
- Let W be the event of Player 2 winning. Player 1 wants to choose a_{ch} for each c in C and each h in H in order to minimize P(W).
- Let H_1c be the event that player 1 was dealt card c, and let H_2d be the event that player 2 was dealt card d.
- Let Bh be the event that player 1 bid hand h.
- So now, player 1 wants to choose all of the a_{ch}’s in order to minimize
P(W) = \sum_{c \in C} \sum_{d \in C} \sum_{h \in H} \left(
P(W|H_1c & H_2d & Bh) * P(Bh|H_1c & H_2d) * P(H_1c|H_2d) * P(H_2d)
\right)
(forgive my LaTeX-like syntax).
- P(W|H_1c & H_2d & Bh) is the chances that player 2 will win in this particular situation. It is either 0 or 1, depending on whether player 2 can correctly decide whether to believe player 1’s bid, and it will change from 0 to 1 and back again as the a_{ih}’s change for all i in C (player 1 will change actions depending on the most likely card player 2 has). In C++-like syntax, these will typically be of the form
((3*a_{ch} > 4*a_{eh}) ? 1 : 0)
for some other card e.
- P(Bh|H_1c & H_2d) is really just P(Bh|H_1c), since player 1’s bid is independent of the card player 2 was dealt. This is just a_{ch}.
- P(H_1c|H_2d) is either 3/(n-1) or 4/(n-1) depending on whether c and d have the same rank.
- P(H_2d) is always 4/n.
- We’re also subject to the constraints that every a_{ch} is in [0,1], and for each card c in C, \sum_{h \in H} a_{ch} = 1. That is to say, these are probabilities, and they add up to 1.
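To make the objective concrete, here is a Python sketch that evaluates the triple sum. The hard part, P(W|H_1c & H_2d & Bh), encodes player 2’s best response, so this sketch takes it as a caller-supplied function; the names and the dict encoding of a_{ch} are my own.

```python
def p_win(a, cards, n, win):
    """Evaluate the objective P(W) from the formulation (a sketch).

    a[c][h] : player 1's bidding probabilities a_{ch}
    cards   : the set C of ranks
    n       : deck size (4 copies of each rank)
    win     : win(c, d, h), the 0/1 term P(W | H_1c & H_2d & Bh),
              supplied by the caller since it encodes player 2's
              best response."""
    total = 0.0
    for d in cards:
        p_d = 4 / n                                     # P(H_2d)
        for c in cards:
            p_c_given_d = (3 if c == d else 4) / (n - 1)  # P(H_1c | H_2d)
            for h, p_bid in a[c].items():               # P(Bh | H_1c) = a_{ch}
                total += win(c, d, h) * p_bid * p_c_given_d * p_d
    return total

# Sanity check on the 8-card deck: with win == 1 everywhere, the
# probability factors must sum to exactly 1.
cards = ["A", "K"]
a = {"A": {"ace high": 0.5, "pair of aces": 0.5}, "K": {"ace high": 1.0}}
print(p_win(a, cards, 8, lambda c, d, h: 1))  # 1.0
```

Player 1’s optimization is then: choose the a[c][h] values, subject to the row-sum constraints, to minimize this quantity against player 2’s best response.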
This looks a lot like a quadratic programming problem (think linear programming, but kicked up a notch) (Edit: it’s very similar to linear programming. I was just being silly). However, the first probability in our objective function switches between 0 and 1 without taking on any value in between, so it’s slightly more general. The objective function is still continuous: the only times the probabilities flip from 0 to 1 or back occur when the most likely card in player 1’s hand changes, and this happens when two cards have an equal probability of occurring (given player 1’s bid and player 2’s card). However, I suspect that the gradient of the objective function has some discontinuities along these manifolds, which makes everything trickier. I’ve started to look into algorithms to solve quadratic programming problems, but so far I’m not convinced I could easily adapt one to work here.
Any thoughts on how to solve this? Reid and I haven’t even figured it out for an 8-card deck (containing 4 aces and 4 kings). In this deck, player 1 has a 3/7 chance of winning if he always bids “ace high,” but he also has a 3/7 chance if he always bids a pair of whatever card he has. We suspect there’s a way of combining these and getting better odds of winning by sometimes bidding the ace and sometimes bidding the pair, but we haven’t figured out what it is yet (note that you’re guaranteed to lose if you bid king high or a pair of the card you don’t have). My hope is that if we can solve this for the 8-card deck we can generalize to a full deck of cards.
Edit: It’s really just linear programming, where your objective function is piecewise linear, but also continuous and concave up. This is totally solvable, since there’s only one minimum, which is the right solution. It should be reachable with any hill-climbing algorithm, and I’m pretty sure I can adapt the simplex algorithm to solve it, too. Re-edit: this is harder than I had thought. There are an exponential number of regions, so I don’t want to enumerate them all, and I’m not yet convinced I can solve this without doing that.
Special thanks to Reid, who has some interesting ideas and has helped me get this far into the problem. It’s surprisingly difficult, given how easy the formulation is.
> We suspect there’s a way of combining these
Did you take any game theory?
http://en.wikipedia.org/wiki/Mixed_strategy
Now I’ve gotta go work, but I hope to come back and read this in more detail later.
I have not taken any game theory. I strongly suspect there is always a mixed Nash equilibrium, though.
Earlier today I solved the 8-card deck version, and it doesn’t matter if you use a pure or a mixed strategy; the best you can do is a 3/7 chance of winning. I’ll hopefully post more details about this soon.
It sounds like in reality this game, like poker, will have nothing to do with math and everything to do with people.
I agree that in the general case (i.e., when the total number of cards dealt is more than 2), any good program will make predictions based on the past behavior of the opponents, and these predictions will play a critical role in whatever strategy it uses. However, if you go first in the final round, you can’t change your behavior based on what your opponent does (by the time he bids, the game is deterministic).
Consequently, I think that for this final round, the best you can do is to use whatever the mathematically optimal strategy is, so that your opponent cannot take advantage of you. I’m assuming that you play the game enough that your opponents can figure out your end-game strategy if they so choose, which is likely to be the case here (the tournament will contain at least thousands of games, if not more).